Dublin City University at the TweetMT 2015 Shared Task
Abstract
We describe our participation in TweetMT for three language pairs in both directions: Spanish to and from Catalan, Basque and Portuguese. We used a range of techniques: statistical and rule-based MT, morph segmentation, data selection with ParFDA and system combination. As for resources, our focus was on crawling vast amounts of tweets to perform monolingual domain adaptation. Our system was the best of all systems submitted for five out of the six language directions.
Similar resources
EHU at TweetMT: Adapting MT Engines for Formal Tweets
This paper describes the participation of the IXA group from the UPV/EHU (University of the Basque Country) in the TweetMT shared task at the SEPLN-2015 conference. We have adapted existing MT engines for the es-eu and eu-es pairs, obtaining good results (better than other experiments reported in previous work). Three main aspects are described: resource compilation, engine adaptation and results.
Overview of TweetMT: A Shared Task on Machine Translation of Tweets at SEPLN 2015
This article presents an overview of the shared task that took place as part of the TweetMT workshop held at SEPLN 2015. The task consisted in translating collections of tweets from and to several languages. The article outlines the data collection and annotation process, the development and evaluation of the shared task, as well as the results achieved by the participants.
The DCU Discourse Parser: A Sense Classification Task
This paper describes the discourse parsing system developed at Dublin City University for participation in the CoNLL 2015 shared task. We participated in two tasks: a connective and argument identification task and a sense classification task. This paper focuses on the latter task and especially the sense classification for implicit connectives.
The UPC TweetMT participation: Translating Formal Tweets Using Context Information
In this paper, we describe the UPC systems that participated in the TweetMT shared task. We developed two main systems that were applied to the Spanish–Catalan language pair: a state-of-the-art phrase-based statistical machine translation system and a context-aware system. In the second approach, we define the “context” for a tweet as the tweets of a user produced in the same day, and also, we ...
Exploration of Feature Combination in Geo-visual Ranking for Visual Content-based Location Prediction
DAY 1: FIRST MORNING SESSION 9:00–9:15 Opening Brief words of welcome by Martha Larson and Gareth Jones 9:15–10:15 Search and Hyperlinking of Television Content Chair: Maria Eskevich (Dublin City University, Ireland) I. (20 min.) Search and Hyperlinking Task overview: The Search and Hyperlinking Task at MediaEval 2013 (presenter: Robin Aly, University of Twente, Netherlands) II. (10 min.) Linke...